Notebook 3: bandle
1 Overview
This notebook contains the R code used to perform differential localisation
analyses using the bandle package.
1.1 Install and load packages
Install R packages needed for data processing -
if (!require("BiocManager", quietly = TRUE)) {
install.packages("BiocManager")
}
BiocManager::install(c("bandle",
"coda",
"tictoc",
"here"))Load packages
1.2 Load data
1.3 The bandle package
In this analyses we use the bandle package to perform differential
localisation analyses. bandle which stands for Bayesian ANalysis of
Differential Localisation, is a method developed by Crook et al. in 2022 which
implements integrative semi-supervised functional mixture model, to obtain the
probability of a protein being differentially localised between two conditions.
2 Preparing data for bandle
2.1 Data formatting
bandle requires two lists of MSnSets one for each condition, containing
the common proteins between all conditions and replicates
## put the MSnSets into a "list"
quant_list <- list(basal_prots[[1]], basal_prots[[2]], basal_prots[[3]],
insulin_prots[[1]], insulin_prots[[2]], insulin_prots[[3]])
## Keep common proteins across all replicates and conditions
quant_list <- commonFeatureNames(quant_list)## 4005 features in common
2.2 Fitting Gaussian Process
First we fit non-parametric regression functions to the markers profiles.
2.2.1 Prior
The prior needs to form a K*3 matrix (where K is the number of organelle classes in the data).
One column is for the prior, one for the length-scale amplitude and one for the SD parameters.
2.2.2 Fit GP using priors
Now the complexity priors have been generated, we pass them as an argument to the fitGPmaternPC function.
Plot the GP fits
## create labels for the plots
id <- c(paste0("basal-rep", 1:3), paste0("insulin-rep", 1:3))
## now plot all GPs
for (i in seq(all_conditions)) {
# png(file = paste0("3_running_bandle/figures/gpParams_",
# id[i], ".png"), height = 4000, width = 4000,
# units = "px", res = 400)
par(mar=c(2,2,2,2), oma = c(2,2,6,2), mfrow = c(4,3))
plotGPmatern(all_conditions[[i]], params = gpParams[[i]])
mtext(id[i], side = 3, outer = TRUE, font = 2, cex = 2, line =2)
# dev.off()
}2.3 Setting the prior on weights
Dirichlet distribution is a family of continuous multivariate probability
distributions parameterized by a vector of positive reals. It is a multivariate
generalization of the beta distribution.
set.seed(1)
dirPrior = diag(rep(1, K)) + matrix(0.0002, nrow = K, ncol = K)
### sanity check, doesn't actually affect bandle function as only dirPrior goes in
predDirPrior <- prior_pred_dir(object = all_conditions[[1]],
dirPrior = dirPrior,
q = 15)Save the parameters,
3 Running BANDLE
The bandle method is computationally intensive requires a HPC to run. We used
the University of Cambridge HPC to run the analysis. The output bandle run
has been uploaded to Zenodo at xxx with DOI (add URL and DOI).